Pronunciation Rules for Indian English Text-to-speech System

Author

  • Aniruddha Sen
Abstract

Text-to-speech synthesis in Indian English is useful for delivering messages stored on computers and the web to Indian users unfamiliar with a standard English accent. Such work is under way at TIFR, and this paper reports the salient features of the front-end language processor, which generates pronunciation and stress information. The important components of the language processor are a parser to categorize words, an Indian English phonetic dictionary, a morphological analyzer, letter-to-sound rules, phonological rules, prosody rules, and an Indian name detector. The relevant rules are formulated with the aid of the large CMU pronunciation dictionary and GENEX, a language tool developed in-house that can generate a sub-dictionary satisfying a set of specified constraints. The paper outlines the rule formulation procedure and provides examples of various types of rules. A few important morphological rules and letter-to-sound rules are described in detail.
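As an illustration only, the lookup order implied by the component list (phonetic dictionary, then morphological analysis, then letter-to-sound rules) might be sketched as follows. This is a minimal sketch, not the paper's implementation: the lexicon entries, suffix rules, letter-to-sound rules, phoneme symbols, and the names `LEXICON`, `SUFFIX_RULES`, `LTS_RULES`, `letter_to_sound`, and `pronounce` are all assumptions made up for the example, and stress, phonological rules, prosody, and name detection are left out.

```python
# Hypothetical toy lexicon; symbols are ARPAbet-like and chosen for illustration.
LEXICON = {
    "walk": ["W", "AO", "K"],
    "call": ["K", "AO", "L"],
}

# Hypothetical morphological rules: (suffix, phonemes appended to the stem).
SUFFIX_RULES = [
    ("ing", ["IH", "NG"]),
    ("s", ["Z"]),
]

# Hypothetical letter-to-sound rules; longer graphemes are listed first
# so that "sh" is matched before "s".
LTS_RULES = [
    ("sh", ["SH"]),
    ("a", ["AE"]),
    ("i", ["IH"]),
    ("n", ["N"]),
    ("p", ["P"]),
    ("s", ["S"]),
    ("t", ["T"]),
]


def letter_to_sound(word):
    """Scan left to right, applying the first rule that matches at each position."""
    phones = []
    i = 0
    while i < len(word):
        for grapheme, output in LTS_RULES:
            if word.startswith(grapheme, i):
                phones.extend(output)
                i += len(grapheme)
                break
        else:
            i += 1  # no rule for this letter in the toy rule set: skip it
    return phones


def pronounce(word):
    """Try the dictionary, then suffix stripping, then letter-to-sound rules."""
    word = word.lower()
    if word in LEXICON:
        return list(LEXICON[word])
    for suffix, suffix_phones in SUFFIX_RULES:
        stem = word[: -len(suffix)]
        if word.endswith(suffix) and stem in LEXICON:
            return LEXICON[stem] + suffix_phones
    return letter_to_sound(word)


print(pronounce("walking"))
print(pronounce("ship"))
```

In this sketch the dictionary takes priority, so the rule-based fallbacks only handle words the lexicon lacks; a real system would need far richer rules than these placeholders.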


Similar resources

Speech Synthesis for Mixed-Language Navigation Instructions

Text-to-Speech (TTS) systems that can read navigation instructions are one of the most widely used speech interfaces today. Text in the navigation domain may contain named entities such as location names that are not in the language that the TTS database is recorded in. Moreover, named entities can be compound words where individual lexical items belong to different languages. These named entit...

Outline: Applications of Neural Nets Nettalk -learning Pronunciation of English Text Classifying Sonar Targets 16.1 Nettalk 16.1.1 Overview Phoneme String Text Speech Figure 16.1: a Text-to-speech System Using Nettalk

NETtalk is a classic example of a back-propagation trained multi-layer perceptron network applied to a practical problem. NETtalk, created by Sejnowski and Rosenberg [1], applies a multi-layer network to the text-to-speech problem. The goal is to develop a system which can convert English text into its underlying sequence of phonemes and stress markers. The string of phonemes and stress mar...

Evaluating the Pronunciation Component of Text-to-Speech Systems for English: A Performance Comparison

The automatic derivation of word pronunciations from input text is a central task for any text-to-speech system. For general English text at least, this is often thought to be a solved problem, with manually-derived linguistic rules assumed capable of handling ‘novel’ words missing from the system dictionary. Data-driven methods, based on machine learning of the regularities implicit in a large...

An Algorithm for High Accuracy Name Pronunciation by Parametric Speech Synthesizer

Automatic and accurate pronunciation of personal names by parametric speech synthesizer has become a crucial limitation for applications within the telecommunications industry, since the technology is needed to provide new automated services such as reverse directory assistance (number to name). Within text-to-speech technology, however, it was not possible to offer such functionality. This was...

Lexical and Acoustic Adaptation for Multiple Non-Native English Accents

This work investigates the impact of non-native English accents on the performance of a large vocabulary continuous speech recognition (LVCSR) system. Based on the GlobalPhone corpus [1], a speech corpus was collected consisting of English sentences read by native speakers of Bulgarian, Chinese, German and Indian languages. To accommodate for non-native pronunciations, two directions are follo...



Publication date: 2003